1.
J Card Fail ; 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38065306

ABSTRACT

BACKGROUND: Wild-type transthyretin amyloid cardiomyopathy (ATTRwt-CM), an increasingly recognized cause of heart failure (HF), often remains undiagnosed until later stages of the disease. METHODS AND RESULTS: A previously developed machine learning algorithm was simplified to create a random forest model based on 11 selected phenotypes predictive of ATTRwt-CM to estimate ATTRwt-CM risk in hypothetical patient scenarios. Using U.S. medical claims datasets (IQVIA), International Classification of Diseases codes were extracted to identify a training cohort of patients with ATTRwt-CM (cases) or nonamyloid HF (controls). After assessment in a 20% test sample of the training cohort, model performance was validated in cohorts of patients with International Classification of Diseases codes for ATTRwt-CM or cardiac amyloidosis vs nonamyloid HF derived from medical claims (IQVIA) or electronic health records (Optum). The simplified model performed well in identifying patients with ATTRwt-CM vs nonamyloid HF in the test sample, with an accuracy of 74%, sensitivity of 77%, specificity of 72%, and area under the curve of 0.82; robust performance was also observed in the validation cohorts. CONCLUSIONS: This simplified machine learning model accurately estimated the empirical probability of ATTRwt-CM in administrative datasets, suggesting it may serve as an easily implementable tool for clinical assessment of patient risk for ATTRwt-CM in the clinical setting. BRIEF LAY SUMMARY: Wild-type transthyretin amyloid cardiomyopathy (ATTRwt-CM for short) is a frequently overlooked cause of heart failure. Finding ATTRwt-CM early is important because the disease can worsen rapidly without treatment. Researchers developed a computer program that predicts the risk of ATTRwt-CM in patients with heart failure. In this study, the program was used to check for 11 medical conditions linked to ATTRwt-CM in the medical claims records of patients with heart failure. 
The program was 74% accurate in identifying ATTRwt-CM in patients with heart failure and was then used to develop an educational online tool for doctors (the wtATTR-CM estimATTR).
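
The accuracy, sensitivity, and specificity quoted in this abstract are all ratios over a confusion matrix of cases and controls. The following is a minimal Python sketch of those definitions; the class name, function names, and counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing model predictions with case/control labels."""
    tp: int  # cases flagged as cases
    fn: int  # cases the model missed
    tn: int  # controls correctly not flagged
    fp: int  # controls wrongly flagged as cases


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true cases that the model flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of true controls that the model leaves unflagged."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all patients classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts for illustration only (NOT the study's data).
example = ConfusionCounts(tp=80, fn=20, tn=70, fp=30)
```

Because accuracy pools cases and controls, it depends on the case-to-control ratio of the cohort, whereas sensitivity and specificity do not.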

2.
Commun Med (Lond) ; 3(1): 189, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38123736

ABSTRACT

BACKGROUND: Primary immunodeficiency (PI) is a group of heterogeneous disorders resulting from immune system defects. Over 70% of PI is undiagnosed, leading to increased mortality, co-morbidity and healthcare costs. Among PI disorders, combined immunodeficiencies (CID) are characterized by complex immune defects. Common variable immunodeficiency (CVID) is among the most common types of PI. In light of available treatments, it is critical to identify adult patients at risk for CID and CVID, before the development of serious morbidity and mortality. METHODS: We developed a deep learning-based method (named "TabMLPNet") to analyze clinical history from nationally representative medical claims from electronic health records (Optum® data, covering all US), evaluated in the setting of identifying CID/CVID in adults. Further, we revealed the most important CID/CVID-associated antecedent phenotype combinations. Four large cohorts were generated: a total of 47,660 PI cases and (1:1 matched) controls. RESULTS: The sensitivity/specificity of TabMLPNet modeling ranges from 0.82-0.88/0.82-0.85 across cohorts. Distinctive combinations of antecedent phenotypes associated with CID/CVID are identified, consisting of respiratory infections/conditions, genetic anomalies, cardiac defects, autoimmune diseases, blood disorders and malignancies, which can possibly be useful to systematize the identification of CID and CVID. CONCLUSIONS: We demonstrated an accurate method in terms of CID and CVID detection evaluated on large-scale medical claims data. Our predictive scheme can potentially lead to the development of new clinical insights and expanded guidelines for identification of adult patients at risk for CID and CVID as well as be used to improve patient outcomes on population level.


Primary immunodeficiencies (PI) are disorders that weaken the immune system, increasing the incidence of life-threatening infections, organ damage and the development of cancer and autoimmune diseases. Although PI is estimated to affect 1-2% of the global population, 70-90% of these patients remain undiagnosed. Many patients are diagnosed during adulthood, after other serious diseases have already developed. We developed a computational method to analyze the clinical history from a large group of people with and without PI. We focused on combined immunodeficiency (CID) and common variable immunodeficiency (CVID), which are among the least studied and most common PI subtypes, respectively. We could identify people with CID or CVID, as well as combinations of diseases and symptoms that could make it easier to identify CID or CVID. Our method could be used to more readily identify adults at risk of CID or CVID, enabling treatment to start earlier and their long-term health to be improved.

3.
BMJ Open ; 13(10): e070028, 2023 Oct 29.
Article in English | MEDLINE | ID: mdl-37899155

ABSTRACT

OBJECTIVE: The aim of this study was to evaluate the potential real-world application of a machine learning (ML) algorithm, developed and trained on heart failure (HF) cohorts in the USA, to detect patients with undiagnosed wild-type cardiac amyloidosis (ATTRwt) in the UK. DESIGN: In this retrospective observational study, anonymised, linked primary and secondary care data (Clinical Practice Research Datalink GOLD and Hospital Episode Statistics, respectively) were used to identify patients diagnosed with HF between 2009 and 2018 in the UK. International Classification of Diseases (ICD)-10 clinical modification codes were matched to equivalent Read (primary care) and ICD-10 WHO (secondary care) diagnosis codes used in the UK. In the absence of specific Read or ICD-10 WHO codes for ATTRwt, two proxy case definitions (definitive and possible cases) based on the degree of confidence that the contributing codes defined true ATTRwt cases were created using ML. PRIMARY OUTCOME MEASURE: Algorithm performance was evaluated primarily using the area under the receiver operating characteristic curve (AUROC) by comparing the actual versus algorithm-predicted case definitions at varying sensitivities and specificities. RESULTS: The algorithm demonstrated strongest predictive ability when a combination of primary care and secondary care data was used (AUROC: 0.84 in definitive cohort and 0.86 in possible cohort). For primary care or secondary care data alone, performance ranged from 0.68 to 0.78. CONCLUSION: The ML algorithm, despite being developed in a US population, was effective at identifying patients who may have ATTRwt in a UK setting. Its potential use in research and clinical care to aid identification of patients with undiagnosed ATTRwt, possibly enabling earlier diagnosis in the disease pathway, should be investigated.


Subjects
Amyloid Neuropathies, Familial; Cardiomyopathies; Heart Failure; Humans; Prealbumin/metabolism; Amyloid Neuropathies, Familial/diagnosis; Amyloid Neuropathies, Familial/complications; Heart Failure/diagnosis; Heart Failure/complications; Cardiomyopathies/diagnosis; Cardiomyopathies/complications; United Kingdom
4.
Am Heart J ; 265: 22-30, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37400049

ABSTRACT

BACKGROUND: An 11-factor random forest model has been developed among ambulatory heart failure (HF) patients for identifying potential wild-type amyloidogenic TTR cardiomyopathy (wtATTR-CM). The model has not been evaluated in a large sample of patients hospitalized for HF. METHODS: This study included Medicare beneficiaries aged ≥65 years hospitalized for HF in the Get With The Guidelines-HF® Registry from 2008-2019. Patients with and without a diagnosis of ATTR-CM were compared, as defined by inpatient and outpatient claims data within 6 months pre- or post-index hospitalization. Within a cohort matched 1:1 by age and sex, univariable logistic regression was used to evaluate relationships between ATTR-CM and each of the 11 factors of the established model. Discrimination and calibration of the 11-factor model were assessed. RESULTS: Among 205,545 patients (median age 81 years) hospitalized for HF across 608 hospitals, 627 patients (0.31%) had a diagnosis code for ATTR-CM. Univariable analysis within the 1:1 matched cohort of each of the 11-factors in the ATTR-CM model found pericardial effusion, carpal tunnel syndrome, lumbar spinal stenosis, and elevated serum enzymes (e.g., troponin elevation) to be strongly associated with ATTR-CM. The 11-factor model showed modest discrimination (c-statistic 0.65) and good calibration within the matched cohort. CONCLUSIONS: Among US patients hospitalized for HF, the number of patients with ATTR-CM defined by diagnosis codes on an inpatient/outpatient claim within 6 months of admission was low. Most factors within the prior 11-factor model were associated with greater odds of ATTR-CM diagnosis. In this population, the ATTR-CM model demonstrated modest discrimination.
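
The c-statistic reported here measures discrimination: the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The sketch below computes it by direct pair counting. The function name and the toy scores are hypothetical and unrelated to the registry data.

```python
from typing import Sequence


def c_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of case/control pairs where the case scores higher.

    Ties between a case and a control count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical risk scores: 1 marks a case, 0 marks a control.
toy_scores = [0.9, 0.4, 0.6, 0.4, 0.2, 0.7]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_c = c_statistic(toy_scores, toy_labels)  # 6.5 concordant pairs out of 9
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why 0.65 is described as modest.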

5.
Clin Med Insights Cardiol ; 16: 11795468221133608, 2022.
Article in English | MEDLINE | ID: mdl-36386406

ABSTRACT

Background: Wild-type transthyretin amyloid cardiomyopathy (ATTR-CM) is a frequently under-recognized cause of heart failure (HF) in older patients. To improve identification of patients at risk for the disease, we initiated a pilot program in which 9 cardiac/non-cardiac phenotypes and 20 high-performing phenotype combinations predictive of wild-type ATTR-CM were operationalized in electronic health record (EHR) configurations at a large academic medical center. Methods: Inclusion criteria were age >50 years and HF; exclusion criteria were end-stage renal disease and prior amyloidosis diagnoses. The different Epic EHR configurations investigated were a clinical decision support tool (Best Practice Advisory) and operational/analytical reports (Clarity™, Reporting Workbench™, and SlicerDicer); the different data sources employed were problem list, visit diagnosis, medical history, and billing transactions. Results: With Clarity, among 45 051 patients with HF, 4006 patients (8.9%) had ⩾1 phenotype combination associated with increased risk of wild-type ATTR-CM. Across all data sources, 2 phenotypes (cardiomegaly; osteoarthrosis) and 2 combinations (carpal tunnel syndrome + HF; atrial fibrillation + heart block + cardiomegaly + osteoarthrosis) generated the highest proportions of patients for wild-type ATTR-CM screening. Conclusion: All EHR configurations tested were capable of operationalizing phenotypes or phenotype combinations to identify at-risk patients; the Clarity report was the most comprehensive.

6.
Nat Commun ; 12(1): 2725, 2021 May 11.
Article in English | MEDLINE | ID: mdl-33976166

ABSTRACT

Transthyretin amyloid cardiomyopathy, an often unrecognized cause of heart failure, is now treatable with a transthyretin stabilizer. It is therefore important to identify at-risk patients who can undergo targeted testing for earlier diagnosis and treatment, prior to the development of irreversible heart failure. Here we show that a random forest machine learning model can identify potential wild-type transthyretin amyloid cardiomyopathy using medical claims data. We derive a machine learning model in 1071 cases and 1071 non-amyloid heart failure controls and validate the model in three nationally representative cohorts (9412 cases, 9412 matched controls), and a large, single-center electronic health record-based cohort (261 cases, 39393 controls). We show that the machine learning model performs well in identifying patients with cardiac amyloidosis in the derivation cohort and all four validation cohorts, thereby providing a systematic framework to increase the suspicion of transthyretin cardiac amyloidosis in patients with heart failure.


Subjects
Amyloid Neuropathies, Familial/metabolism; Cardiomyopathies/metabolism; Heart Failure/metabolism; Machine Learning; Prealbumin/metabolism; Amyloid Neuropathies, Familial/genetics; Cardiomyopathies/genetics; Electronic Health Records; Heart Failure/genetics; Humans; Prealbumin/genetics
7.
Transcr Open Access ; 1(1), 2013 Jun 19.
Article in English | MEDLINE | ID: mdl-24860841

ABSTRACT

BACKGROUND: Transposable Elements (TEs) have long been regarded as selfish or junk DNA having little or no role in the regulation or functioning of the human genome. However, over the past several years this view came to be challenged as several studies provided anecdotal as well as global evidence for the contribution of TEs to the regulatory and coding needs of human genes. In this study, we explored the incorporation and epigenetic regulation of coding sequences donated by TEs using gene expression and other ancillary genomics data from two human hematopoietic cell lines: GM12878 (a lymphoblastoid cell line) and K562 (a chronic myelogenous leukemia cell line). In each cell line, we found several thousand instances of TEs donating coding sequences to human genes. We compared the transcriptome assembly of the RNA sequencing (RNA-Seq) reads with and without the aid of a reference transcriptome and found that the percentage of genes that incorporate TEs in their coding sequences is significantly greater than that obtained from the reference transcriptome assemblies using RefSeq and GENCODE gene models. We also used histone modification chromatin immunoprecipitation sequencing (ChIP-Seq) data, Cap Analysis of Gene Expression (CAGE) data and DNase I Hypersensitivity Site (DHS) data to demonstrate the epigenetic regulation of the TE-derived coding sequences. Our results suggest that TEs form a significantly higher percentage of coding sequences than represented in gene annotation databases and that these TE-derived sequences are epigenetically regulated in accordance with their expression in the two cell types.

8.
PLoS One ; 7(3): e34286, 2012.
Article in English | MEDLINE | ID: mdl-22479588

ABSTRACT

Gene expression quantitative trait loci (eQTL) are useful for identifying single nucleotide polymorphisms (SNPs) associated with diseases. At times, a genetic variant may be associated with a master regulator involved in the manifestation of a disease. The downstream target genes of the master regulator are typically co-expressed and share biological function. Therefore, it is practical to screen for eQTLs by identifying SNPs associated with the targets of a transcript-regulator (TR). We used a multivariate regression with the gene expression of known targets of TRs and SNPs to identify TReQTLs in European (CEU) and African (YRI) HapMap populations. A nominal p-value of <1×10⁻⁶ revealed 234 SNPs in CEU and 154 in YRI as TReQTLs. These represent 36 independent (tag) SNPs in CEU and 39 in YRI affecting the downstream targets of 25 and 36 TRs respectively. At a false discovery rate (FDR) = 45%, one cis-acting tag SNP (within 1 kb of a gene) in each population was identified as a TReQTL. In CEU, the SNP (rs16858621) in Pcnxl2 was found to be associated with the genes regulated by CREM whereas in YRI, the SNP (rs16909324) was linked to the targets of miRNA hsa-miR-125a. To infer the pathways that regulate expression, we ranked TReQTLs by connectivity within the structure of biological process subtrees. One TReQTL SNP (rs3790904) in CEU maps to Lphn2 and is associated (nominal p-value = 8.1×10⁻⁷) with the targets of the X-linked breast cancer suppressor Foxp3. The structure of the biological process subtree and a gene interaction network of the TReQTL revealed that tumor necrosis factor, NF-kappaB and variants in G-protein coupled receptors signaling may play a central role as communicators in Foxp3 functional regulation. The potential pleiotropic effect of the Foxp3 TReQTLs was gleaned from integrating mRNA-Seq data and SNP-set enrichment into the analysis.
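
The abstract contrasts nominal p-values with a false discovery rate but does not say which FDR procedure was used. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg adjustment. The function name and the toy p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal p-values for four SNP tests (not from the study).
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_q = benjamini_hochberg(toy_p)
```

Under this procedure, calling every test with an adjusted value at or below a threshold significant keeps the expected share of false discoveries at or below that threshold, so a threshold of 45% tolerates that nearly half of the reported hits could be false positives.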


Subjects
Polymorphism, Single Nucleotide; Quantitative Trait Loci; Black People/genetics; Chromosome Mapping/methods; False Positive Reactions; Genetic Variation; Genome; Genome-Wide Association Study; Genotype; Humans; Models, Biological; Models, Genetic; NF-kappa B/metabolism; Oligonucleotide Array Sequence Analysis; Phenotype; Regression Analysis; Signal Transduction; White People/genetics
9.
PLoS One ; 6(11): e27513, 2011.
Article in English | MEDLINE | ID: mdl-22087331

ABSTRACT

Experimentally characterized enhancer regions have previously been shown to display specific patterns of enrichment for several different histone modifications. We modelled these enhancer chromatin profiles in the human genome and used them to guide the search for novel enhancers derived from transposable element (TE) sequences. To do this, a computational approach was taken to analyze the genome-wide histone modification landscape characterized by the ENCODE project in two human hematopoietic cell types, GM12878 and K562. We predicted the locations of 2,107 and 1,448 TE-derived enhancers in the GM12878 and K562 cell lines respectively. A vast majority of these putative enhancers are unique to each cell line; only 3.5% of the TE-derived enhancers are shared between the two. We evaluated the functional effect of TE-derived enhancers by associating them with the cell-type specific expression of nearby genes, and found that the number of TE-derived enhancers is strongly positively correlated with the expression of nearby genes in each cell line. Furthermore, genes that are differentially expressed between the two cell lines also possess a divergent number of TE-derived enhancers in their vicinity. As such, genes that are up-regulated in the GM12878 cell line and down-regulated in K562 have significantly more TE-derived enhancers in their vicinity in the GM12878 cell line and vice versa. These data indicate that human TE-derived sequences are likely to be involved in regulating cell-type specific gene expression on a broad scale and suggest that the enhancer activity of TE-derived sequences is mediated by epigenetic regulatory mechanisms.


Subjects
Chromatin/metabolism; DNA Transposable Elements; Gene Expression; Genome, Human; Transcription Factors; Cell Line; Epigenesis, Genetic; Erythroid Precursor Cells; Histones; Humans; K562 Cells; Methods
10.
Genome Biol Evol ; 3: 259-71, 2011.
Article in English | MEDLINE | ID: mdl-21362639

ABSTRACT

Independent lines of investigation have documented effects of both transposable elements (TEs) and gene length (GL) on gene expression. However, TE gene fractions are highly correlated with GL, suggesting that they cannot be considered independently. We evaluated the TE environment of human genes and GL jointly in an attempt to tease apart their relative effects. TE gene fractions and GL were compared with the overall level of gene expression and the breadth of expression across tissues. GL is strongly correlated with overall expression level but weakly correlated with the breadth of expression, confirming the selection hypothesis that attributes the compactness of highly expressed genes to selection for economy of transcription. However, TE gene fractions overall, and for the L1 family in particular, show stronger anticorrelations with expression level than GL, indicating that GL may not be the most important target of selection for transcriptional economy. These results suggest a specific mechanism, removal of TEs, by which highly expressed genes are selectively tuned for efficiency. MIR elements are the only family of TEs with gene fractions that show a positive correlation with tissue-specific expression, suggesting that they may provide regulatory sequences that help to control human gene expression. Consistent with this notion, MIR fractions are relatively enriched close to transcription start sites and associated with coexpression in specific sets of related tissues. Our results confirm the overall relevance of the TE environment to gene expression and point to distinct mechanisms by which different TE families may contribute to gene regulation.
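
When two predictors are themselves correlated, as TE gene fraction and gene length are here, one common way to separate their effects is a partial correlation that holds the third variable fixed. The abstract does not state which joint analysis was used, so the sketch below is only an assumed illustration. The function names and toy values are hypothetical and were chosen for arithmetic convenience, not to resemble the study's data.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(
    x: Sequence[float], y: Sequence[float], z: Sequence[float]
) -> float:
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Toy values (hypothetical): both x and y rise with z, which masks the
# opposing relationship between x and y once z is held fixed.
gene_length = [1.0, 2.0, 3.0, 4.0]   # z
te_fraction = [2.0, 1.0, 2.0, 5.0]   # x
expression = [0.0, 3.0, 4.0, 3.0]    # y

raw_r = pearson(te_fraction, expression)
partial_r = partial_correlation(te_fraction, expression, gene_length)
```

In this toy data the raw correlation is weakly positive while the partial correlation is strongly negative, which shows how a shared dependence on a third variable can hide or even reverse an association.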


Subjects
DNA Transposable Elements; Gene Expression Regulation; Gene Expression; Genome, Human; Cluster Analysis; DNA, Intergenic; Databases, Genetic; Humans; Linear Models; Models, Genetic; Transcription Initiation Site
11.
Gene ; 475(1): 39-48, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21215797

ABSTRACT

It was previously thought that epigenetic histone modifications of mammalian transposable elements (TEs) serve primarily to defend the genome against deleterious effects associated with their activity. However, we recently showed that, genome-wide, human TEs can also be epigenetically modified in a manner consistent with their ability to regulate host genes. Here, we explore the ability of TE sequences to epigenetically regulate individual human genes by focusing on the histone modifications of promoter sequences derived from TEs. We found 1520 human genes that initiate transcription from within TE-derived promoter sequences. We evaluated the distributions of eight histone modifications across these TE-promoters, within and between the GM12878 and K562 cell lines, and related their modification status with the cell-type specific expression patterns of the genes that they regulate. TE-derived promoters are significantly enriched for active histone modifications, and depleted for repressive modifications, relative to the genomic background. Active histone modifications of TE-promoters peak at transcription start sites and are positively correlated with increasing expression within cell lines. Furthermore, differential modification of TE-derived promoters between cell lines is significantly correlated with differential gene expression. LTR-retrotransposon derived promoters in particular play a prominent role in mediating cell-type specific gene regulation, and a number of these LTR-promoter genes are implicated in lineage-specific cellular functions. The regulation of human genes mediated by histone modifications targeted to TE-derived promoters is consistent with the ability of TEs to contribute to the epigenomic landscape in a way that provides functional utility to the host genome.


Subjects
DNA Transposable Elements/genetics; Epigenomics; Promoter Regions, Genetic/genetics; Cell Line; Genome-Wide Association Study; Histones/metabolism; Humans
12.
Bioinformatics ; 26(20): 2501-8, 2010 Oct 15.
Article in English | MEDLINE | ID: mdl-20871106

ABSTRACT

MOTIVATION: Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is widely used in biological research. ChIP-seq experiments yield many ambiguous tags that can be mapped with equal probability to multiple genomic sites. Such ambiguous tags are typically eliminated from consideration, resulting in a potential loss of important biological information. RESULTS: We have developed a Gibbs sampling-based algorithm for the genomic mapping of ambiguous sequence tags. Our algorithm relies on the local genomic tag context to guide the mapping of ambiguous tags. The Gibbs sampling procedure we use simultaneously maps ambiguous tags and updates the probabilities used to infer correct tag map positions. We show that our algorithm is able to correctly map more ambiguous tags than existing mapping methods. Our approach is also able to uncover mapped genomic sites from highly repetitive sequences that cannot be detected based on unique tags alone, including transposable elements, segmental duplications and peri-centromeric regions. This mapping approach should prove to be useful for increasing biological knowledge on the too often neglected repetitive genomic regions. AVAILABILITY: http://esbg.gatech.edu/jordan/software/map CONTACT: king.jordan@biology.gatech.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
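
The general idea of using local tag context inside a Gibbs sampler can be sketched with a toy model. This is not the authors' algorithm, whose details are not given in the abstract. It assumes, for illustration, that each ambiguous tag is resampled with probability proportional to the number of tags currently supporting each candidate site. The function name, parameters, and site labels are all hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence


def gibbs_map_tags(
    unique_counts: Dict[str, int],
    ambiguous_tags: Sequence[Sequence[str]],
    sweeps: int = 50,
    pseudocount: float = 1.0,
    seed: int = 0,
) -> List[str]:
    """Assign each ambiguous tag to one of its candidate sites.

    unique_counts holds the number of uniquely mapped tags per site, and
    ambiguous_tags[i] lists the candidate sites of ambiguous tag i.
    """
    rng = random.Random(seed)
    # Start from a uniformly random assignment.
    assignment = [rng.choice(list(cands)) for cands in ambiguous_tags]
    # Support per site = unique tags + ambiguous tags currently placed there.
    support = Counter(unique_counts)
    support.update(assignment)
    for _ in range(sweeps):
        for i, cands in enumerate(ambiguous_tags):
            support[assignment[i]] -= 1  # take tag i out of its current site
            weights = [support[site] + pseudocount for site in cands]
            assignment[i] = rng.choices(list(cands), weights=weights, k=1)[0]
            support[assignment[i]] += 1  # put it back at the sampled site
    return assignment


# Hypothetical input: siteA has strong unique-tag support, siteB almost none.
toy_unique = {"siteA": 50, "siteB": 1}
toy_tags = [("siteA", "siteB")] * 20
toy_result = gibbs_map_tags(toy_unique, toy_tags)
toy_counts = Counter(toy_result)
```

Because the support counts change every time a tag moves, the mapping probabilities are updated as the tags are reassigned, which mirrors the simultaneous update described in the abstract in a very reduced form.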


Subjects
Algorithms; Computational Biology/methods; Sequence Analysis, DNA/methods; Sequence Tagged Sites; Base Sequence; Chromatin Immunoprecipitation; DNA Transposable Elements/genetics; Genome
13.
Mob DNA ; 1(1): 2, 2010 Jan 25.
Article in English | MEDLINE | ID: mdl-20226072

ABSTRACT

BACKGROUND: Transposition is disruptive in nature and, thus, it is imperative for host genomes to evolve mechanisms that suppress the activity of transposable elements (TEs). At the same time, transposition also provides diverse sequences that can be exapted by host genomes as functional elements. These notions form the basis of two competing hypotheses pertaining to the role of epigenetic modifications of TEs in eukaryotic genomes: the genome defense hypothesis and the exaptation hypothesis. To date, all available evidence points to the genome defense hypothesis as the best explanation for the biological role of TE epigenetic modifications. RESULTS: We evaluated several predictions generated by the genome defense hypothesis versus the exaptation hypothesis using recently characterized epigenetic histone modification data for the human genome. To this end, we mapped chromatin immunoprecipitation sequence tags from 38 histone modifications, characterized in CD4+ T cells, to the human genome and calculated their enrichment and depletion in all families of human TEs. We found that several of these families are significantly enriched or depleted for various histone modifications, both active and repressive. The enrichment of human TE families with active histone modifications is consistent with the exaptation hypothesis and stands in contrast to previous analyses that have found mammalian TEs to be exclusively repressively modified. Comparisons between TE families revealed that older families carry more histone modifications than younger ones, another observation consistent with the exaptation hypothesis. However, data from within family analyses on the relative ages of epigenetically modified elements are consistent with both the genome defense and exaptation hypotheses. 
Finally, TEs located proximal to genes carry more histone modifications than the ones that are distal to genes, as may be expected if epigenetically modified TEs help to regulate the expression of nearby host genes. CONCLUSIONS: With a few exceptions, most of our findings support the exaptation hypothesis for the role of TE epigenetic modifications when vetted against the genome defense hypothesis. The recruitment of epigenetic modifications may represent an additional mechanism by which TEs can contribute to the regulatory functions of their host genomes.

14.
Ann N Y Acad Sci ; 1178: 276-84, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19845643

ABSTRACT

Transposable element (TE) sequences make up a substantial fraction of mammalian genomes and exert a variety of regulatory influences on mammalian genes. We explore the contributions of TEs to the epigenetic mechanisms that regulate mammalian genomes, emphasizing nucleosome positioning and epigenetic histone modifications. A link between TEs and epigenetics rests on the fact that underlying genetic sequences partially mediate the nature and identity of epigenetic modifications. Here, we review the studies that have uncovered histone modifications that are targeted to mammalian TE sequences and propose a series of hypotheses regarding the potential epigenetic regulatory effects of mammalian TEs. We propose that mammalian TE sequences have specific nucleosome binding properties with regulatory implications for nearby genes, are involved in the phasing of nucleosomes, and recruit epigenetic modifications to function as enhancers; that epigenetic modifications at TE sequences affect the regulation of nearby genes; and that TEs serve as epigenetic boundary elements. It is hoped that these proposed scenarios may help to serve as a roadmap for future investigations into the epigenetic regulatory effects of mammalian TEs.


Subjects
DNA Transposable Elements/genetics; Epigenesis, Genetic; Genome; Animals; Humans; Nucleosomes/metabolism
15.
Methods Mol Biol ; 537: 323-36, 2009.
Article in English | MEDLINE | ID: mdl-19378152

ABSTRACT

Eukaryotic genomes are full of repetitive DNA, transposable elements (TEs) in particular, and accordingly there are a number of computational methods that can be used to identify TEs from genomic sequences. We present here a survey of two of the most readily available and widely used bioinformatics applications for the detection, characterization, and analysis of TE sequences in eukaryotic genomes: CENSOR and RepeatMasker. For each program, information on availability, input, output, and the algorithmic methods used is provided. Specific examples of the use of CENSOR and RepeatMasker are also described. CENSOR and RepeatMasker both rely on homology-based methods for the detection of TE sequences. There are several other classes of methods available for the analysis of repetitive DNA sequences including de novo methods that compare genomic sequences against themselves, class-specific methods that use structural characteristics of specific classes of elements to aid in their identification, and pipeline methods that combine aspects of some or all of the aforementioned methods. We briefly consider the strengths and weaknesses of these different classes of methods with an emphasis on their complementary utility for the analysis of repetitive DNA in eukaryotes.


Subjects
DNA Transposable Elements/genetics; Sequence Analysis, DNA/methods; Software; Base Sequence; Computational Biology; Molecular Sequence Data; Sequence Alignment; User-Computer Interface
16.
Gene ; 436(1-2): 12-22, 2009 May 01.
Article in English | MEDLINE | ID: mdl-19393174

ABSTRACT

We evaluated the epigenetic contributions of repetitive DNA elements to human gene regulation. Human proximal promoter sequences show distinct distributions of transposable elements (TEs) and simple sequence repeats (SSRs). TEs are enriched distal from transcriptional start sites (TSSs) and their frequency decreases closer to TSSs, being largely absent from the core promoter region. SSRs, on the other hand, are found at low frequency distal to the TSS and then increase in frequency starting approximately 150 bp upstream of the TSS. The peak of SSR density is centered around the -35 bp position where the basal transcriptional machinery assembles. These trends in repetitive sequence distribution are strongly correlated, positively for TEs and negatively for SSRs, with relative nucleosome binding affinities along the promoters. Nucleosomes bind with highest probability distal from the TSS and the nucleosome binding affinity steadily decreases reaching its nadir just upstream of the TSS at the same point where SSR frequency is at its highest. Promoters that are enriched for TEs are more highly and broadly expressed, on average, than promoters that are devoid of TEs. In addition, promoters that have similar repetitive DNA profiles regulate genes that have more similar expression patterns and encode proteins with more similar functions than promoters that differ with respect to their repetitive DNA. Furthermore, distinct repetitive DNA promoter profiles are correlated with tissue-specific patterns of expression. These observations indicate that repetitive DNA elements mediate chromatin accessibility in proximal promoter regions and the repeat content of promoters is relevant to both gene expression and function.


Subjects
Gene Expression Regulation; Nucleosomes/metabolism; Promoter Regions, Genetic/genetics; Repetitive Sequences, Nucleic Acid/genetics; Analysis of Variance; Animals; Binding Sites; Binding, Competitive; Cluster Analysis; DNA Transposable Elements/genetics; Gene Expression Profiling; Humans; Transcription Initiation Site
17.
Biol Direct ; 3: 9, 2008 Mar 24.
Article in English | MEDLINE | ID: mdl-18361801

ABSTRACT

We analyzed the chicken (Gallus gallus) genome sequence to search for previously uncharacterized endogenous retrovirus (ERV) sequences using ab initio and combined evidence approaches. We discovered 11 novel families of ERVs that occupy more than 21 million base pairs, approximately 2%, of the chicken genome. These novel families include a number of recently active full-length elements possessing identical long terminal repeats (LTRs) as well as intact gag and pol open reading frames. The abundance and diversity of chicken ERVs we discovered underscore the utility of an approach that combines multiple methods for the identification of interspersed repeats in vertebrate genomes.


Subjects
Chickens/genetics; Chickens/virology; Endogenous Retroviruses/genetics; Genome; Multigene Family; Animals; Computational Biology; Endogenous Retroviruses/chemistry; Humans; Phylogeny; Sequence Analysis, DNA; Software; Terminal Repeat Sequences